{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
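    "# Wilcoxon Signed-Rank Test for Paired-Design Data\n",
    "\n",
    "## Case\n",
    "\n",
    "Investigate whether the IL-6 level (specifically, its median) in vitiligo patients differs between lesional (white-patch) skin and normal skin.\n",
    "\n",
    "### Data"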
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (8, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>pathient</th><th>pathological</th><th>normal</th></tr><tr><td>i64</td><td>f64</td><td>f64</td></tr></thead><tbody><tr><td>1</td><td>40.03</td><td>88.57</td></tr><tr><td>2</td><td>97.13</td><td>88.0</td></tr><tr><td>3</td><td>80.32</td><td>123.72</td></tr><tr><td>4</td><td>25.32</td><td>39.03</td></tr><tr><td>5</td><td>19.61</td><td>24.37</td></tr><tr><td>6</td><td>14.5</td><td>192.75</td></tr><tr><td>7</td><td>49.63</td><td>121.57</td></tr><tr><td>8</td><td>44.56</td><td>89.76</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (8, 3)\n",
       "┌──────────┬──────────────┬────────┐\n",
       "│ pathient ┆ pathological ┆ normal │\n",
       "│ ---      ┆ ---          ┆ ---    │\n",
       "│ i64      ┆ f64          ┆ f64    │\n",
       "╞══════════╪══════════════╪════════╡\n",
       "│ 1        ┆ 40.03        ┆ 88.57  │\n",
       "│ 2        ┆ 97.13        ┆ 88.0   │\n",
       "│ 3        ┆ 80.32        ┆ 123.72 │\n",
       "│ 4        ┆ 25.32        ┆ 39.03  │\n",
       "│ 5        ┆ 19.61        ┆ 24.37  │\n",
       "│ 6        ┆ 14.5         ┆ 192.75 │\n",
       "│ 7        ┆ 49.63        ┆ 121.57 │\n",
       "│ 8        ┆ 44.56        ┆ 89.76  │\n",
       "└──────────┴──────────────┴────────┘"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import polars as pl\n",
    "\n",
    "df = pl.read_csv(\"B_10_1-data.csv\")\n",
    "\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
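    "### Analysis\n",
    "\n",
    "The outcome variable is numeric, the explanatory variable is a binary categorical variable (lesional vs. normal site), and the observations in the two groups can be paired one-to-one. A Wilcoxon signed-rank test for paired-design data is therefore appropriate.\n",
    "\n",
    "In principle, a paired-samples t test can also be used, and it retains more of the information in the data, but it assumes that the paired differences are approximately normally distributed. See [7-1 t test for sample means](../B%20Chapter%2007/B_07_1.ipynb)\n",
    "\n",
    "### Test\n",
    "\n",
    "#### Hypotheses\n",
    "\n",
    "$ H_0 $: The population median of the differences in IL-6 level between the two sites is zero.  \n",
    "$ H_1 $: The population median of the differences in IL-6 level between the two sites is not zero.\n",
    "\n",
    "#### Hypothesis testing\n",
    "\n",
    "##### 1. Wilcoxon signed-rank test"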
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "T = 2.0\n",
      "P = 0.0234375\n",
      "significance: True\n",
      "\n"
     ]
    }
   ],
   "source": [
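    "from scipy.stats import wilcoxon\n",
    "\n",
    "# Pass 1-D NumPy arrays (not single-column DataFrames) so scipy returns scalars\n",
    "# Exact two-sided Wilcoxon signed-rank test on the paired differences\n",
    "res = wilcoxon(\n",
    "    df[\"normal\"].to_numpy(), df[\"pathological\"].to_numpy(), method=\"exact\"\n",
    ")\n",
    "\n",
    "print(\n",
    "f\"\"\"T = {float(res.statistic)}\n",
    "P = {float(res.pvalue)}\n",
    "significance: {float(res.pvalue) < 0.05}\n",
    "\"\"\"\n",
    ")"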
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$ P = 0.023 < 0.05 $; at the $ \\alpha = 0.05 $ level, reject $ H_0 $ and conclude that the difference in IL-6 level between lesional and normal sites is statistically significant."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### 2. Paired t test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "t statistic = 2.4023027654607882\n",
      "degrees of freedom = 7.0\n",
      "P = 0.04730584037128341\n",
      "significance: True\n"
     ]
    }
   ],
   "source": [
    "from scipy.stats import ttest_rel\n",
    "\n",
    "# Pass 1-D NumPy arrays (not single-column DataFrames) so scipy returns scalars\n",
    "# Two-sided paired-samples t test on the paired differences\n",
    "res = ttest_rel(df[\"normal\"].to_numpy(), df[\"pathological\"].to_numpy())\n",
    "\n",
    "print(\n",
    "    f\"\"\"t statistic = {float(res.statistic)}\n",
    "degrees of freedom = {float(res.df)}\n",
    "P = {float(res.pvalue)}\n",
    "significance: {float(res.pvalue) < 0.05}\"\"\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
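    "$ P = 0.047 < 0.05 $; at the $ \\alpha = 0.05 $ level, reject $ H_0 $ (for the paired t test, that the population mean of the differences is zero) and conclude that the difference in IL-6 level between lesional and normal sites is statistically significant."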
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
